In this paper, we propose a novel variable-length estimation approach for shape sensing of extensible soft robots utilizing fiber Bragg gratings (FBGs). Shape reconstruction from FBG sensors has been increasingly developed for soft robots, but the narrow stretching range of an FBG fiber makes it difficult to acquire accurate sensing results for extensible robots. To address this limitation, we introduce an FBG-based length sensor that leverages a rigid curved channel, through which the FBGs are allowed to slide within the robot following its body extension/compression; we can then search for and match the FBGs in the fiber exhibiting the channel's specific constant curvature to determine the effective length. By fusing the above measurements, a model-free filtering technique is presented for simultaneous calibration of a variable-length model and temporally continuous length estimation of the robot, enabling accurate shape sensing using solely FBGs. The performance of the proposed method has been experimentally evaluated on an extensible soft robot equipped with an FBG fiber in both free and unstructured environments. The results concerning dynamic accuracy and robustness of length estimation and shape sensing demonstrate the effectiveness of our approach.
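The curvature-matching idea behind the length sensor can be sketched as a toy search: gratings whose measured curvature matches the rigid channel's known constant curvature are assumed to lie inside the channel, and the effective length follows from where the fiber enters it. All names, units, and the tolerance below are illustrative assumptions, not the paper's implementation.

```python
def effective_length(grating_positions, grating_curvatures,
                     channel_curvature, tol=0.5):
    """grating_positions: arc-length position of each FBG along the fiber (mm),
    ordered from robot tip to fiber end; grating_curvatures: curvature measured
    at each FBG (1/m); channel_curvature: known constant curvature of the rigid
    channel (1/m). Returns the estimated effective length in mm, or None if no
    grating matches the channel curvature."""
    in_channel = [p for p, k in zip(grating_positions, grating_curvatures)
                  if abs(k - channel_curvature) < tol]
    if not in_channel:
        return None
    # The proximal-most matching grating marks where the fiber enters the
    # channel, so everything before it is the robot's effective length.
    return min(in_channel)

# Toy example: gratings every 10 mm; the channel bends the fiber at 20 1/m,
# and the last three gratings read that curvature.
positions = [0, 10, 20, 30, 40, 50]
curvatures = [3.1, 2.8, 2.5, 20.1, 19.9, 20.0]
print(effective_length(positions, curvatures, channel_curvature=20.0))  # prints 30
```

As the robot extends or compresses, different gratings slide into the channel, so the matched set (and hence the estimated length) changes continuously.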
In this paper, we propose a novel and generic data-driven approach for servo control of the 3-D shape of continuum robots embedded with fiber Bragg grating (FBG) sensors. The development of 3-D shape sensing and control techniques is crucial for continuum robots to autonomously perform tasks in surgical interventions. However, owing to the nonlinear properties of continuum robots, the main difficulty lies in their modeling, especially for soft robots with variable stiffness. To address this problem, we propose a new robust adaptive controller leveraging FBG shape feedback and neural networks (NNs), which can online estimate the unknown model of the continuum robot and account for unexpected disturbances as well as NN approximation errors, exhibiting adaptive behavior toward the unmodeled system without prior data exploration. Based on a new composite adaptation algorithm, the asymptotic convergence of the closed-loop system with the NN learning parameters is proven via Lyapunov theory. To validate the proposed method, we present a comprehensive experimental study using two continuum robots, both integrated with multi-core FBGs: a robot-assisted colonoscope and a multi-section extensible soft manipulator. The results demonstrate the feasibility, adaptability, and superiority of our controller in various unstructured environments as well as in phantom experiments.
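The flavor of a composite adaptive controller can be shown on a scalar toy plant: the control law uses an online estimate of the unknown plant parameter, and that estimate is updated from both the tracking error and a prediction error. The plant, gains, and update law below are a minimal textbook-style sketch under our own assumptions, not the paper's NN-based controller.

```python
def simulate(theta=2.0, steps=2000, dt=0.002, k=5.0, gamma=20.0):
    """Scalar plant x' = theta*x + u with unknown theta; the controller uses
    the estimate theta_hat and drives x to the constant reference x_ref."""
    x, x_ref = 0.0, 1.0
    theta_hat = 0.0
    for _ in range(steps):
        e = x - x_ref
        u = -k * e - theta_hat * x            # certainty-equivalence control
        x_dot = theta * x + u                 # true (unknown) plant dynamics
        # Prediction error (theta - theta_hat)*x; in practice x_dot would be
        # obtained via filtering rather than assumed measurable.
        pred_err = x_dot - (theta_hat * x + u)
        # Composite update: tracking-error term + prediction-error term.
        theta_hat += gamma * (e * x + pred_err * x) * dt
        x += x_dot * dt
    return x, theta_hat

x, theta_hat = simulate()   # x approaches 1.0, theta_hat approaches 2.0
```

The composite term speeds up parameter convergence relative to tracking-error-only adaptation; the same structure underlies adaptive laws whose stability is argued via a Lyapunov function in the tracking and estimation errors.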
This paper presents an optimal motion planning framework to automatically generate versatile quadrupedal jumping motions (e.g., flips, spins). The jumping motions via centroidal dynamics are formulated as a 12-dimensional black-box optimization problem subject to the robot's kinodynamic constraints. Gradient-based methods have achieved great success in solving trajectory optimization (TO); however, they require prior knowledge (e.g., reference motions, contact schedules) and can lead to sub-optimal solutions. The newly proposed framework first employs a heuristic-based optimization method to avoid these problems. Moreover, a prioritized fitness function is created for the heuristic-based algorithms in the robot's ground reaction force (GRF) planning, enhancing convergence and search performance. Since heuristic-based algorithms usually require significant computation time, motions are planned offline and stored as a motion-prior library. A selector is designed to automatically choose motions with user-specified commands or perception information as input. The framework is successfully validated with only a simple continuous tracking PD controller on an open-source Mini Cheetah through several challenging jumping motions, including jumping over a window-shaped obstacle of 30 cm height and left-flipping over a rectangular obstacle of 27 cm height.
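A prioritized fitness function can be sketched as follows: constraint violations (unilateral contact, friction cone, force limits on the GRFs) dominate the score so that any feasible candidate outranks every infeasible one, and the task objective only discriminates among feasible candidates. The limits, penalty weights, and candidate format below are illustrative assumptions, not the paper's exact formulation.

```python
def prioritized_fitness(grf_profile, target_height, achieved_height,
                        mu=0.6, f_max=500.0):
    """grf_profile: planar GRF samples (fx, fz) in newtons along the jump.
    Lower fitness is better."""
    violation = 0.0
    for fx, fz in grf_profile:
        if fz < 0:                             # unilateral contact: no pulling
            violation += -fz
        else:
            if abs(fx) > mu * fz:              # friction-cone constraint
                violation += abs(fx) - mu * fz
            if fz > f_max:                     # actuator/force limit
                violation += fz - f_max
    if violation > 0:
        # Priority 1: drive violations to zero before optimizing the task.
        return 1e6 + violation
    # Priority 2: task objective, e.g., reaching the commanded jump height.
    return abs(target_height - achieved_height)

feasible = [(10.0, 100.0), (20.0, 120.0)]
infeasible = [(200.0, 100.0), (0.0, -5.0)]
assert prioritized_fitness(feasible, 0.30, 0.28) < prioritized_fitness(infeasible, 0.30, 0.30)
```

A black-box optimizer (e.g., an evolutionary search) minimizing this score first finds dynamically feasible GRF profiles and only then refines the jump itself.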
Shape control of deformable objects is a challenging and important robotic problem. This paper proposes a model-free controller using novel 3-D global deformation features based on modal analysis. Unlike most existing controllers that use geometric features, our controller employs a physically-based deformation feature by decomposing the 3-D global deformation into low-frequency mode shapes. Although modal analysis is widely adopted in computer vision and simulation, it has not yet been used in robotic deformation control. We develop a new model-free framework for modal-based deformation control under robot manipulation. The physical interpretation of the mode shapes enables us to formulate an analytical deformation Jacobian matrix mapping the robot manipulation onto changes of the modal features. In the Jacobian matrix, the unknown geometry and physical properties of the object are treated as low-dimensional modal parameters, which can be used to linearly parameterize the closed-loop system. Thus, an adaptive controller with proven stability can be designed to deform the object while online estimating the modal parameters. Simulations and experiments are conducted using linear, planar, and solid objects under different settings. The results not only confirm the superior performance of our controller but also demonstrate its advantages over the baseline methods.
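The modal feature itself is simple to illustrate: the object's 3-D deformation (displacement of each tracked point from its rest shape) is projected onto a few low-frequency mode shapes, yielding a compact coefficient vector the controller can regulate. The mode shapes below are made up for illustration; in practice they would come from modal analysis of a reference model.

```python
def modal_features(rest_points, deformed_points, mode_shapes):
    """rest_points/deformed_points: lists of (x, y, z) tuples; mode_shapes:
    list of modes, each a list of (dx, dy, dz) per point. Returns one modal
    coefficient per mode (the inner product of displacement and mode)."""
    disp = [(dx - rx, dy - ry, dz - rz)
            for (rx, ry, rz), (dx, dy, dz) in zip(rest_points, deformed_points)]
    feats = []
    for mode in mode_shapes:
        feats.append(sum(u * m for d, ms in zip(disp, mode)
                         for u, m in zip(d, ms)))
    return feats

# Two points lifted in z; mode 1 = uniform lift, mode 2 = tilt.
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deformed = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.3)]
modes = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)],
         [(0.0, 0.0, -1.0), (0.0, 0.0, 1.0)]]
feats = modal_features(rest, deformed, modes)
```

Because only a handful of low-frequency coefficients are kept, high-frequency sensing noise is filtered out and the closed-loop system stays low-dimensional.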
In this paper, we propose an iterative self-training framework for sim-to-real 6D object pose estimation to facilitate cost-effective robotic grasping. Given a bin-picking scenario, we establish a photo-realistic simulator to synthesize abundant virtual data, and use them to train an initial pose estimation network. This network then takes the role of a teacher model, which generates pose predictions for unlabeled real data. With these predictions, we further design a comprehensive adaptive selection scheme to distinguish reliable results, and leverage them as pseudo labels to update a student model for pose estimation on real data. To continuously improve the quality of the pseudo labels, we iterate the above steps by taking the trained student model as a new teacher and relabeling the real data with the refined teacher model. We evaluate our method on a public benchmark and our newly released dataset, where it improves over existing methods by 11.49% and 22.62%, respectively. Our method is also able to improve the robotic bin-picking success rate by 19.54%, demonstrating the potential of iterative sim-to-real solutions for robotic applications.
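The value of confidence-based pseudo-label selection can be shown on a toy 1-D "pose" task: the real-world pose of sample x is x + 0.5, the teacher's predictions are noisy on some samples, and filtering by a confidence score before fitting the student yields a less biased estimate than using all predictions. The data, confidence scores, and one-parameter student below are placeholders for the paper's networks and adaptive selection scheme.

```python
real_xs   = [1.0, 2.0, 3.0, 4.0]
# Teacher predictions on real data (true offset is 0.5; two are noisy):
teacher_y = [1.5, 2.9, 3.5, 4.4]
conf      = [0.95, 0.30, 0.92, 0.40]   # stand-in confidence scores

def select(xs, ys, confs, thr=0.8):
    """Adaptive-selection stand-in: keep only high-confidence predictions."""
    return [(x, y) for x, y, c in zip(xs, ys, confs) if c >= thr]

def fit_student(pairs):
    """Toy student y = x + b: fit b as the mean residual on pseudo labels."""
    return sum(y - x for x, y in pairs) / len(pairs)

b_selected = fit_student(select(real_xs, teacher_y, conf))   # 0.5 exactly
b_all = fit_student(list(zip(real_xs, teacher_y)))           # biased by noise
```

In the full framework this select-and-fit step runs in a loop: the fitted student becomes the new teacher, relabels the real data, and the selection is repeated, gradually improving pseudo-label quality.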
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point clouds tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on nuScenes benchmark. Moreover, CMT has a strong robustness even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
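One common way to encode 3-D points into token features, which conveys the spirit of the implicit alignment described above, is a sinusoidal expansion of each coordinate that can be added to image and point-cloud tokens. The frequency schedule below is the standard transformer one; CMT's actual encoding may differ, so treat this purely as an illustrative sketch.

```python
import math

def pos_encode(point, num_freqs=4):
    """Expand an (x, y, z) point into sin/cos features at several frequencies.
    Returns a vector of length 3 * num_freqs * 2."""
    feats = []
    for coord in point:                       # x, then y, then z
        for i in range(num_freqs):
            f = 2.0 ** i                      # geometric frequency schedule
            feats.append(math.sin(f * coord))
            feats.append(math.cos(f * coord))
    return feats

emb = pos_encode((0.5, -1.2, 3.0))            # 24-dimensional embedding
```

Because the same 3-D point produces the same embedding regardless of which modality's token it is attached to, tokens from different sensors that refer to nearby locations end up close in feature space without any explicit view transformation.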
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
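The NAIVEATTACK variant is easy to sketch: before distillation begins, a fixed trigger patch is stamped into a fraction of the images and their labels are flipped to the attacker's target class. The tiny grayscale images, patch shape, and poison rate below are illustrative only; the actual attack operates on real image tensors inside the distillation pipeline.

```python
import copy

def add_trigger(img, value=1.0, size=1):
    """Stamp a size x size patch of `value` into the bottom-right corner of a
    2-D grayscale image (list of rows), leaving the original untouched."""
    img = copy.deepcopy(img)
    for r in range(len(img) - size, len(img)):
        for c in range(len(img[0]) - size, len(img[0])):
            img[r][c] = value
    return img

def poison_dataset(images, labels, target_label, rate=0.3):
    """Poison the first `rate` fraction of the dataset with the trigger and
    relabel those samples to the attacker's target class."""
    n = max(1, int(len(images) * rate))
    poisoned_imgs = [add_trigger(im) if i < n else im
                     for i, im in enumerate(images)]
    poisoned_lbls = [target_label if i < n else lb
                     for i, lb in enumerate(labels)]
    return poisoned_imgs, poisoned_lbls

imgs = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
lbls = [0, 1, 2, 3]
p_imgs, p_lbls = poison_dataset(imgs, lbls, target_label=9, rate=0.5)
```

DOORPING differs in that the trigger is not fixed up front but re-optimized at every distillation iteration, which is why it reaches much higher attack success rates.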
Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes with a limited number of support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold: first, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice from two aspects, i.e., the feature level and the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modifications. When benchmarking on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and models will be available.
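The first of the two enhancements can be sketched compactly: a class center is pooled from the support features under the support mask, and each query feature is then re-weighted by its similarity to that center. The feature dimensions, pooling, and cosine-based weighting below are illustrative assumptions rather than RefT's exact module.

```python
import math

def masked_center(support_feats, support_mask):
    """support_feats: per-location feature vectors; support_mask: 0/1 per
    location. Returns the mean foreground feature (the dynamic class center)."""
    fg = [f for f, m in zip(support_feats, support_mask) if m]
    d = len(fg[0])
    return [sum(f[i] for f in fg) / len(fg) for i in range(d)]

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def reweight(query_feats, center):
    """Scale each query feature by (1 + similarity to the class center)."""
    return [[(1.0 + cosine(f, center)) * x for x in f] for f in query_feats]

support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mask = [1, 1, 0]                         # only the first two are foreground
center = masked_center(support, mask)    # [0.95, 0.05]
boosted = reweight([[1.0, 0.0], [0.0, 1.0]], center)
```

Query locations resembling the masked support object are amplified while background-like locations are left nearly unchanged, which is the "feature-level" reference; the "instance-level" reference then reuses the support object queries via cross-attention.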
This paper focuses on designing efficient models with low parameter counts and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of the Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern \textbf{I}nverted \textbf{R}esidual \textbf{M}obile \textbf{B}lock (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase \textbf{E}fficient \textbf{MO}del (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, \eg, our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, surpassing \textbf{SoTA} CNN-/Transformer-based models, while trading off model accuracy and efficiency well.
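The hybrid design can be conveyed with a 1-D toy block that composes a cheap local (depthwise-conv-like) token mixer with a global (attention-like) mixer inside one residual branch. Shapes, the smoothing kernel, and the exact composition are our own illustrative choices, not EMO's architecture.

```python
import math

def local_mix(tokens, kernel=(0.25, 0.5, 0.25)):
    """Depthwise-conv-like smoothing along the token axis, per channel,
    with border clamping."""
    n, d = len(tokens), len(tokens[0])
    out = []
    for i in range(n):
        row = []
        for c in range(d):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = min(max(i + k - 1, 0), n - 1)
                acc += w * tokens[j][c]
            row.append(acc)
        out.append(row)
    return out

def global_mix(tokens):
    """Single-head self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    scores = [[sum(q[c] * k[c] for c in range(d)) / math.sqrt(d)
               for k in tokens] for q in tokens]
    out = []
    for row in scores:
        m = max(row)                          # softmax, numerically stable
        e = [math.exp(s - m) for s in row]
        z = sum(e)
        attn = [x / z for x in e]
        out.append([sum(a * t[c] for a, t in zip(attn, tokens))
                    for c in range(d)])
    return out

def irmb_like(tokens):
    """Residual block: input + global mixing of locally mixed tokens."""
    mixed = global_mix(local_mix(tokens))
    return [[x + y for x, y in zip(t, m)] for t, m in zip(tokens, mixed)]

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = irmb_like(tokens)
```

The local pass captures short-distance dependencies at convolution-like cost, while the attention pass contributes input-dependent long-distance interactions; stacking such blocks in four down-sampling phases gives the ResNet-like overall layout.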
Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to the semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
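The cross-view objective can be sketched directly: cosine similarity is maximized between the two views of each high-confidence positive pair and minimized between samples and the centers of the other high-confidence clusters, which act as negatives. The embeddings and equal weighting below are toy assumptions, not CCGC's exact loss.

```python
import math

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def ccgc_style_loss(view1, view2, other_centers):
    """view1/view2: aligned lists of embeddings of the same high-confidence
    samples in the two views; other_centers: centers of the other
    high-confidence clusters, used as negatives. Lower is better."""
    pos = sum(cosine(a, b) for a, b in zip(view1, view2)) / len(view1)
    neg = sum(cosine(a, c) for a in view1 for c in other_centers)
    neg /= len(view1) * len(other_centers)
    # Minimizing pulls cross-view positives together and pushes samples
    # away from the other clusters' centers.
    return neg - pos

aligned = ccgc_style_loss([[1.0, 0.0]], [[1.0, 0.0]], [[0.0, 1.0]])
drifted = ccgc_style_loss([[1.0, 0.0]], [[0.0, 1.0]], [[0.0, 1.0]])
```

Using cluster centers rather than arbitrary other samples as negatives avoids accidentally repelling two samples that actually belong to the same cluster.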